Overview

Dataset statistics

Number of variables              8
Number of observations           900
Missing cells                    0
Missing cells (%)                0.0%
Duplicate rows                   0
Duplicate rows (%)               0.0%
Total size in memory             56.4 KiB
Average record size in memory    64.1 B

Variable types

Numeric        7
Categorical    1

Alerts

Area is highly correlated with MajorAxisLength and 4 other fields [High correlation]
MajorAxisLength is highly correlated with Area and 6 other fields [High correlation]
MinorAxisLength is highly correlated with Area and 4 other fields [High correlation]
Eccentricity is highly correlated with MajorAxisLength and 3 other fields [High correlation]
ConvexArea is highly correlated with Area and 5 other fields [High correlation]
Perimeter is highly correlated with Area and 6 other fields [High correlation]
Extent is highly correlated with MajorAxisLength and 3 other fields [High correlation]
Class is highly correlated with Area and 5 other fields [High correlation]
Class is uniformly distributed [Uniform]
Area has unique values [Unique]
MajorAxisLength has unique values [Unique]
MinorAxisLength has unique values [Unique]
Eccentricity has unique values [Unique]
Extent has unique values [Unique]
Perimeter has unique values [Unique]

Reproduction

Analysis started          2023-03-13 18:06:14.378670
Analysis finished         2023-03-13 18:06:26.438119
Duration                  12.06 seconds
Software version          pandas-profiling v3.3.0
Download configuration    config.json

Variables

Area
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            87804.12778
Minimum         25387
Maximum         235047
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      25387
5-th percentile              42242.95
Q1                           59348
median                       78902
Q3                           105028.25
95-th percentile             171256.4
Maximum                      235047
Range                        209660
Interquartile range (IQR)    45680.25

Descriptive statistics

Standard deviation                 39002.11139
Coefficient of variation (CV)      0.4441945086
Kurtosis                           1.074072908
Mean                               87804.12778
Median Absolute Deviation (MAD)    21615
Skewness                           1.17523739
Sum                                79023715
Variance                           1521164693
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
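
Several of the figures above are derived from one another, which gives a quick consistency check on the summary. The sketch below recomputes them from the reported Area values only; the variable names are illustrative and are not part of the report, and the inputs are the rounded figures as printed, so results agree only up to rounding.

```python
# Values copied from the Area summary (rounded as printed in the report).
minimum, maximum = 25387, 235047
q1, q3 = 59348, 105028.25
mean, std = 87804.12778, 39002.11139
total, n = 79023715, 900

value_range = maximum - minimum    # Range = Maximum - Minimum
iqr = q3 - q1                      # Interquartile range = Q3 - Q1
cv = std / mean                    # Coefficient of variation = std / mean
variance = std ** 2                # Variance = std squared
mean_from_sum = total / n          # Mean = Sum / number of observations

print("Range:", value_range)
print("IQR:", iqr)
print("CV:", round(cv, 6))
print("Variance:", round(variance))
print("Mean:", round(mean_from_sum, 5))
```

The same relations hold for the other numeric variables in this report.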
Common values

Value                 Count    Frequency (%)
87524                 1        0.1%
118043                1        0.1%
124166                1        0.1%
133784                1        0.1%
80481                 1        0.1%
63491                 1        0.1%
109791                1        0.1%
89236                 1        0.1%
152267                1        0.1%
89721                 1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value    Count    Frequency (%)
25387    1        0.1%
26908    1        0.1%
28216    1        0.1%
31237    1        0.1%
31275    1        0.1%
32097    1        0.1%
33565    1        0.1%
33615    1        0.1%
33662    1        0.1%
34559    1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
235047    1        0.1%
225043    1        0.1%
223075    1        0.1%
222915    1        0.1%
218459    1        0.1%
210923    1        0.1%
208264    1        0.1%
206720    1        0.1%
206689    1        0.1%
205497    1        0.1%

MajorAxisLength
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            430.9299505
Minimum         225.629541
Maximum         997.2919406
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      225.629541
5-th percentile              280.3049189
Q1                           345.4428978
median                       407.8039511
Q3                           494.187014
95-th percentile             648.239285
Maximum                      997.2919406
Range                        771.6623996
Interquartile range (IQR)    148.7441162

Descriptive statistics

Standard deviation                 116.0351206
Coefficient of variation (CV)      0.269266781
Kurtosis                           1.326807969
Mean                               430.9299505
Median Absolute Deviation (MAD)    71.09952165
Skewness                           0.9895441336
Sum                                387836.9554
Variance                           13464.14922
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
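
As a rough illustration of what "fixed size bins" means, the sketch below splits the reported MajorAxisLength range into 50 equal-width bins and counts values into them. The helper `fixed_width_histogram` is hypothetical and is not the report's plotting code; the ten input values are the MajorAxisLength entries from the "First rows" sample of this report, so the counts describe only those ten rows.

```python
def fixed_width_histogram(values, bins, lo, hi):
    """Count values into `bins` equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # A value equal to `hi` would index one past the end, so clamp it
        # into the last bin (the last bin is treated as closed on the right).
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return width, counts

# Minimum and maximum as reported for MajorAxisLength.
lo, hi = 225.629541, 997.2919406

# MajorAxisLength of the first ten rows shown in the report's sample.
sample = [442.246011, 406.690687, 442.267048, 286.540559, 352.190770,
          318.125407, 310.146072, 332.455472, 323.189607, 366.964842]

width, counts = fixed_width_histogram(sample, 50, lo, hi)
print("bin width:", round(width, 6))
print("non-empty bins:", [(i, c) for i, c in enumerate(counts) if c])
```

With the full 900 observations the same procedure would give the bar heights of the histogram, assuming the plot uses equal-width bins over the observed range.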
Common values

Value                 Count    Frequency (%)
442.2460114           1        0.1%
493.7652445           1        0.1%
525.9453568           1        0.1%
581.2869477           1        0.1%
481.0639534           1        0.1%
326.6328824           1        0.1%
477.0335016           1        0.1%
389.6816918           1        0.1%
598.9676358           1        0.1%
530.1565743           1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value          Count    Frequency (%)
225.629541     1        0.1%
227.2937917    1        0.1%
232.4278475    1        0.1%
243.0382802    1        0.1%
245.4012948    1        0.1%
245.7557814    1        0.1%
246.7636115    1        0.1%
249.7402266    1        0.1%
251.1333844    1        0.1%
251.7422462    1        0.1%

Maximum 10 values

Value          Count    Frequency (%)
997.2919406    1        0.1%
984.0454912    1        0.1%
949.6626718    1        0.1%
843.9566534    1        0.1%
820.724022     1        0.1%
772.956877     1        0.1%
769.4251488    1        0.1%
755.0129141    1        0.1%
746.145341     1        0.1%
740.1087099    1        0.1%

MinorAxisLength
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            254.4881329
Minimum         143.7108718
Maximum         492.2752785
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      143.7108718
5-th percentile              184.7727974
Q1                           219.1111265
median                       247.8484087
Q3                           279.8885746
95-th percentile             348.351003
Maximum                      492.2752785
Range                        348.5644067
Interquartile range (IQR)    60.77744813

Descriptive statistics

Standard deviation                 49.98890171
Coefficient of variation (CV)      0.1964292053
Kurtosis                           0.953915212
Mean                               254.4881329
Median Absolute Deviation (MAD)    29.982991
Skewness                           0.8000493645
Sum                                229039.3196
Variance                           2498.890294
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
253.291155            1        0.1%
308.0001269           1        0.1%
304.1564686           1        0.1%
300.635686            1        0.1%
217.561151            1        0.1%
248.3224493           1        0.1%
294.5338471           1        0.1%
295.3319356           1        0.1%
331.2490172           1        0.1%
223.4999334           1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value          Count    Frequency (%)
143.7108718    1        0.1%
144.618672     1        0.1%
150.2455819    1        0.1%
155.0257428    1        0.1%
156.3955455    1        0.1%
157.9904179    1        0.1%
161.7023657    1        0.1%
166.5935504    1        0.1%
167.3283344    1        0.1%
167.7084908    1        0.1%

Maximum 10 values

Value          Count    Frequency (%)
492.2752785    1        0.1%
440.4971275    1        0.1%
414.1883255    1        0.1%
413.9274732    1        0.1%
412.3828168    1        0.1%
411.810369     1        0.1%
408.5356188    1        0.1%
403.7193275    1        0.1%
402.2832708    1        0.1%
400.8016603    1        0.1%

Eccentricity
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.78154215
Minimum         0.348729642
Maximum         0.96212444
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      0.348729642
5-th percentile              0.602937743
Q1                           0.741766254
median                       0.798846044
Q3                           0.8425710238
95-th percentile             0.8932430856
Maximum                      0.96212444
Range                        0.613394798
Interquartile range (IQR)    0.1008047697

Descriptive statistics

Standard deviation                 0.09031840993
Coefficient of variation (CV)      0.1155643492
Kurtosis                           2.492121436
Mean                               0.78154215
Median Absolute Deviation (MAD)    0.049858966
Skewness                           -1.327503246
Sum                                703.387935
Variance                           0.008157415173
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
0.819738392           1        0.1%
0.781601193           1        0.1%
0.815821133           1        0.1%
0.855870727           1        0.1%
0.891891237           1        0.1%
0.649631331           1        0.1%
0.786627389           1        0.1%
0.652394055           1        0.1%
0.833159184           1        0.1%
0.90679426            1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value          Count    Frequency (%)
0.348729642    1        0.1%
0.369212459    1        0.1%
0.418381972    1        0.1%
0.419753707    1        0.1%
0.432307345    1        0.1%
0.44495009     1        0.1%
0.458544698    1        0.1%
0.460121209    1        0.1%
0.492086934    1        0.1%
0.496118267    1        0.1%

Maximum 10 values

Value          Count    Frequency (%)
0.96212444     1        0.1%
0.951082244    1        0.1%
0.928093615    1        0.1%
0.927737116    1        0.1%
0.925634934    1        0.1%
0.923770364    1        0.1%
0.922383026    1        0.1%
0.921939284    1        0.1%
0.917708543    1        0.1%
0.914840211    1        0.1%

ConvexArea
Real number (ℝ≥0)

HIGH CORRELATION

Distinct        896
Distinct (%)    99.6%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            91186.09
Minimum         26139
Maximum         278217
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      26139
5-th percentile              44044.6
Q1                           61513.25
median                       81651
Q3                           108375.75
95-th percentile             174766.05
Maximum                      278217
Range                        252078
Interquartile range (IQR)    46862.5

Descriptive statistics

Standard deviation                 40769.29013
Coefficient of variation (CV)      0.4470998826
Kurtosis                           1.427257973
Mean                               91186.09
Median Absolute Deviation (MAD)    22667.5
Skewness                           1.242904042
Sum                                82067481
Variance                           1662135018
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
70719                 2        0.2%
81718                 2        0.2%
92317                 2        0.2%
49996                 2        0.2%
58632                 1        0.1%
64892                 1        0.1%
112766                1        0.1%
92014                 1        0.1%
157353                1        0.1%
95252                 1        0.1%
Other values (886)    886      98.4%

Minimum 10 values

Value    Count    Frequency (%)
26139    1        0.1%
28607    1        0.1%
30316    1        0.1%
32564    1        0.1%
33540    1        0.1%
33699    1        0.1%
34787    1        0.1%
35376    1        0.1%
35794    1        0.1%
35824    1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
278217    1        0.1%
239093    1        0.1%
229195    1        0.1%
228259    1        0.1%
227170    1        0.1%
225916    1        0.1%
225592    1        0.1%
221527    1        0.1%
221396    1        0.1%
219952    1        0.1%

Extent
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.6995079264
Minimum         0.379856115
Maximum         0.835454545
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      0.379856115
5-th percentile              0.6116800706
Q1                           0.6708690967
median                       0.707366958
Q3                           0.7349914172
95-th percentile             0.7723654941
Maximum                      0.835454545
Range                        0.45559843
Interquartile range (IQR)    0.0641223205

Descriptive statistics

Standard deviation                 0.05346820029
Coefficient of variation (CV)      0.07643687551
Kurtosis                           3.341383681
Mean                               0.6995079264
Median Absolute Deviation (MAD)    0.0308770895
Skewness                           -1.151504751
Sum                                629.5571338
Variance                           0.002858848442
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
0.758650579           1        0.1%
0.675596941           1        0.1%
0.728998849           1        0.1%
0.679589556           1        0.1%
0.714973571           1        0.1%
0.734356566           1        0.1%
0.744780007           1        0.1%
0.696992892           1        0.1%
0.594050406           1        0.1%
0.547433097           1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value          Count    Frequency (%)
0.379856115    1        0.1%
0.414153748    1        0.1%
0.454188929    1        0.1%
0.491002009    1        0.1%
0.491460095    1        0.1%
0.496936979    1        0.1%
0.500546035    1        0.1%
0.507352075    1        0.1%
0.522278775    1        0.1%
0.526259635    1        0.1%

Maximum 10 values

Value          Count    Frequency (%)
0.835454545    1        0.1%
0.830632225    1        0.1%
0.824319225    1        0.1%
0.817388614    1        0.1%
0.812265444    1        0.1%
0.800255194    1        0.1%
0.796730297    1        0.1%
0.794371379    1        0.1%
0.792998657    1        0.1%
0.792771926    1        0.1%

Perimeter
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct        900
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1165.906636
Minimum         619.074
Maximum         2697.753
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     7.2 KiB

Quantile statistics

Minimum                      619.074
5-th percentile              804.51845
Q1                           966.41075
median                       1119.509
Q3                           1308.38975
95-th percentile             1661.5969
Maximum                      2697.753
Range                        2078.679
Interquartile range (IQR)    341.979

Descriptive statistics

Standard deviation                 273.7643154
Coefficient of variation (CV)      0.2348080945
Kurtosis                           1.74470613
Mean                               1165.906636
Median Absolute Deviation (MAD)    172.495
Skewness                           1.01776109
Sum                                1049315.972
Variance                           74946.9004
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
1184.04               1        0.1%
1394.1                1        0.1%
1388.684              1        0.1%
1502.661              1        0.1%
1219.105              1        0.1%
950.297               1        0.1%
1290.239              1        0.1%
1144.036              1        0.1%
1570.502              1        0.1%
1295.377              1        0.1%
Other values (890)    890      98.9%

Minimum 10 values

Value      Count    Frequency (%)
619.074    1        0.1%
678.815    1        0.1%
683.004    1        0.1%
699.415    1        0.1%
713.775    1        0.1%
713.94     1        0.1%
718.847    1        0.1%
719.935    1        0.1%
727.561    1        0.1%
734.102    1        0.1%

Maximum 10 values

Value       Count    Frequency (%)
2697.753    1        0.1%
2352.029    1        0.1%
2303.69     1        0.1%
2289.889    1        0.1%
2253.557    1        0.1%
2098.263    1        0.1%
1947.46     1        0.1%
1942.05     1        0.1%
1893.414    1        0.1%
1876.307    1        0.1%

Class
Categorical

HIGH CORRELATION
UNIFORM

Distinct        2
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Memory size     7.2 KiB

Kecimen    450
Besni      450

Length

Max length       7
Median length    6
Mean length      6
Min length       5

Characters and Unicode

Total characters       5400
Distinct characters    8
Distinct categories    2
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Kecimen
2nd row    Kecimen
3rd row    Kecimen
4th row    Kecimen
5th row    Kecimen

Common Values

Value      Count    Frequency (%)
Kecimen    450      50.0%
Besni      450      50.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count    Frequency (%)
kecimen    450      50.0%
besni      450      50.0%

Most occurring characters

Value    Count    Frequency (%)
e        1350     25.0%
i        900      16.7%
n        900      16.7%
K        450      8.3%
c        450      8.3%
m        450      8.3%
B        450      8.3%
s        450      8.3%
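
The character counts above follow directly from the two class labels and their frequencies. A minimal sketch, using only the label counts reported for Class (the variable names are illustrative, not taken from the report):

```python
from collections import Counter

# Class label frequencies as reported in the Common Values table.
class_counts = {"Kecimen": 450, "Besni": 450}

# Each character of a label occurs once per row carrying that label.
char_counts = Counter()
for label, rows in class_counts.items():
    for ch in label:
        char_counts[ch] += rows

total_chars = sum(char_counts.values())
uppercase = sum(c for ch, c in char_counts.items() if ch.isupper())
lowercase = sum(c for ch, c in char_counts.items() if ch.islower())

for ch, count in char_counts.most_common():
    print(f"{ch}  {count}  {count / total_chars:.1%}")
print("total:", total_chars, "distinct:", len(char_counts))
print("lowercase:", lowercase, "uppercase:", uppercase)
```

The letter e leads because it appears twice in "Kecimen" and once in "Besni".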

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    4500     83.3%
Uppercase Letter    900      16.7%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
e        1350     30.0%
i        900      20.0%
n        900      20.0%
c        450      10.0%
m        450      10.0%
s        450      10.0%
Uppercase Letter
Value    Count    Frequency (%)
K        450      50.0%
B        450      50.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    5400     100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
e        1350     25.0%
i        900      16.7%
n        900      16.7%
K        450      8.3%
c        450      8.3%
m        450      8.3%
B        450      8.3%
s        450      8.3%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    5400     100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
e        1350     25.0%
i        900      16.7%
n        900      16.7%
K        450      8.3%
c        450      8.3%
m        450      8.3%
B        450      8.3%
s        450      8.3%

Interactions

[Figures: 49 interaction plots (images not reproduced here)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
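
The definition above can be sketched in plain Python as "Pearson correlation applied to ranks". The helpers `average_ranks`, `pearson`, and `spearman` are illustrative names, not the report's implementation, and the inputs are only the Area and Perimeter values of the ten "First rows" shown in this report, so the result says nothing about the full 900-row correlation.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Area and Perimeter of the first ten sample rows in this report.
area = [87524, 75166, 90856, 45928, 79408, 49242, 42492, 60952, 42256, 64380]
perimeter = [1184.040, 1121.786, 1208.575, 844.162, 1073.251,
             881.836, 823.796, 933.366, 849.728, 981.544]

rho = spearman(area, perimeter)
print("rho on ten sample rows:", round(rho, 4))
```

Because only ranks enter the calculation, any strictly increasing transformation of either variable (for example, cubing Area) leaves ρ unchanged.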

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
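
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above. The function name `pearson_r` is illustrative, and the inputs are the MajorAxisLength and MinorAxisLength values of the ten "First rows" in this report, so the resulting r is for those ten rows only.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)

# MajorAxisLength and MinorAxisLength of the first ten sample rows.
major = [442.246011, 406.690687, 442.267048, 286.540559, 352.190770,
         318.125407, 310.146072, 332.455472, 323.189607, 366.964842]
minor = [253.291155, 243.032436, 266.328318, 208.760042, 290.827533,
         200.122120, 176.131449, 235.429835, 172.575926, 227.771615]

r = pearson_r(major, minor)

# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([3.0 * a + 100.0 for a in major],
                          [0.5 * b - 7.0 for b in minor])

print("r:", round(r, 4))
print("r after shift and rescale:", round(r_transformed, 4))
```

The 1/n factors cancel between numerator and denominator, so using sample (n - 1) or population (n) statistics gives the same r.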

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
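
The pair-counting definition can be sketched as follows. This is the plain form described above (often called tau-a); whether the report applies a tie correction is not stated here, and the two forms coincide when there are no ties. The function name `kendall_tau` is illustrative, and the inputs are the Area and Perimeter values of the ten "First rows" in this report.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables order the pair the same way
        elif direction < 0:
            discordant += 1   # the variables order the pair oppositely
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Area and Perimeter of the first ten sample rows in this report.
area = [87524, 75166, 90856, 45928, 79408, 49242, 42492, 60952, 42256, 64380]
perimeter = [1184.040, 1121.786, 1208.575, 844.162, 1073.251,
             881.836, 823.796, 933.366, 849.728, 981.544]

tau = kendall_tau(area, perimeter)
print("tau on ten sample rows:", round(tau, 4))
```

The double loop over pairs is quadratic in the number of observations, which is fine for a sketch but slower than the sort-based algorithms used by statistics libraries.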

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     Area    MajorAxisLength  MinorAxisLength  Eccentricity  ConvexArea  Extent    Perimeter  Class
0    87524   442.246011       253.291155       0.819738      90546       0.758651  1184.040   Kecimen
1    75166   406.690687       243.032436       0.801805      78789       0.684130  1121.786   Kecimen
2    90856   442.267048       266.328318       0.798354      93717       0.637613  1208.575   Kecimen
3    45928   286.540559       208.760042       0.684989      47336       0.699599  844.162    Kecimen
4    79408   352.190770       290.827533       0.564011      81463       0.792772  1073.251   Kecimen
5    49242   318.125407       200.122120       0.777351      51368       0.658456  881.836    Kecimen
6    42492   310.146072       176.131449       0.823099      43904       0.665894  823.796    Kecimen
7    60952   332.455472       235.429835       0.706058      62329       0.743598  933.366    Kecimen
8    42256   323.189607       172.575926       0.845499      44743       0.698031  849.728    Kecimen
9    64380   366.964842       227.771615       0.784056      66125       0.664376  981.544    Kecimen

Last rows

     Area    MajorAxisLength  MinorAxisLength  Eccentricity  ConvexArea  Extent    Perimeter  Class
890  85646   469.774755       238.539384       0.861490      92673       0.681044  1226.892   Besni
891  107486  462.813134       296.091238       0.768571      108914      0.759967  1235.078   Besni
892  149703  637.873030       304.622532       0.878599      154549      0.593805  1596.356   Besni
893  187391  660.655588       362.315007       0.836205      189799      0.713947  1682.478   Besni
894  115272  511.472036       291.591349       0.821574      119773      0.624760  1392.653   Besni
895  83248   430.077308       247.838695       0.817263      85839       0.668793  1129.072   Besni
896  87350   440.735698       259.293149       0.808629      90899       0.636476  1214.252   Besni
897  99657   431.706981       298.837323       0.721684      106264      0.741099  1292.828   Besni
898  93523   476.344094       254.176054       0.845739      97653       0.658798  1258.548   Besni
899  85609   512.081774       215.271976       0.907345      89197       0.632020  1272.862   Besni